Details the development and release of DeepCoder-14B-Preview, a 14B-parameter code-reasoning model trained with reinforcement learning to reach performance comparable to o3-mini, along with the dataset, code, and system optimizations used in its creation.
This article details a method for training large language models (LLMs) for code generation using a secure, local WebAssembly-based code interpreter and reinforcement learning with Group Relative Policy Optimization (GRPO). It covers the setup, training process, evaluation, and potential next steps.
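As context for the GRPO step mentioned in the summary above, the sketch below shows the group-relative advantage computation at the core of GRPO: rewards for a group of completions sampled from the same prompt are normalized against the group mean and standard deviation. The binary unit-test reward scheme and function names here are illustrative assumptions, not code from the article itself.

```python
import numpy as np

def group_relative_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalize rewards within one group of completions for the same prompt.

    Each completion's advantage is its reward relative to the group mean,
    scaled by the group standard deviation (with eps for numerical safety).
    """
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Hypothetical example: 8 completions sampled for one prompt, scored by a
# sandboxed code interpreter (1.0 = all tests pass, 0.0 = failure).
rewards = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0])
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that pass the tests receive positive advantages and failures receive negative ones, so the policy-gradient update shifts probability mass toward passing solutions without needing a separate learned value model.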